ABSTRACT



Device for generating multiple quality level bit-rates in a
video encoder having a motion estimator providing a predicted
block for each predefined block based upon estimating
the motion between the predefined block of the current
image and the corresponding block in the previous image, a
transformer for transforming a prediction error resulting
from the difference between the predicted block and the
predefined block into the frequency domain, and a quantizer
for quantizing the coefficients of the prediction error and
providing the quantized coefficients to a video multiplex
coding unit. Such a device includes a number n of stages,
each corresponding to a quality level i=1 to n, and each stage
having a computer for reducing the prediction error in
accordance with the quality level i and providing a
corresponding quantized prediction error residual QD_i, and an
adder for obtaining a cumulative prediction error:

$$QD_{TOTAL} = \sum_{j=0}^{i} \overline{QD}_j$$

corresponding to quality level i, wherein $\overline{QD}_j$ is the
de-quantized value of the quantized prediction error residual
QD_j. Such a device is well suited to the current H.263 bitstream
structure of the ITU recommendation H.324/H.323 by using the
sub-bitstream feature without the need to change the bitstream
structure.

8 Claims, 4 Drawing Sheets 
DEVICE FOR GENERATING MULTIPLE QUALITY LEVEL BIT-RATES IN A VIDEO
ENCODER

This application is a continuation in part of Ser. No. 09/103,405
filed Jun. 24, 1998, now abandoned and assigned to the same assignee
as that of the present invention.

TECHNICAL FIELD

The present invention relates to the video encoding standard H.263
developed by the International Telecommunication Union (ITU) for very
low bit-rate multimedia telecommunication and particularly to a device
for generating multiple quality level bit-rates in a H.263 video
encoder.

BACKGROUND

The H.263 standard developed by the ITU (International
Telecommunication Union) is a part of its H.324/H.323 recommendations
for very low bit-rate multimedia telecommunication. The H.263 coding
scheme which is described in "Video Coding for Very Low bit-rate
Communication", Draft ITU-Recommendation H.263, May 1996, is based on
earlier schemes used in H.261 and MPEG-1/2 standards, and using a
Hybrid-DPCM concept comprising a motion estimation/compensation
mechanism, transform coding and quantization. Each image is divided
into blocks of size 16x16 pixels (called macroblocks) and the
macroblock in the current picture is predicted from the previous
picture using motion estimation techniques. After the prediction, the
macroblock is divided into four blocks of size 8x8 pixels. The
prediction error is then transformed using the Discrete Cosine
Transform (DCT) and the resulted coefficients are quantized and stored
in the bitstream along with the motion parameters and other side
information. The H.263 standard contains several improvements compared
to earlier standards which allow a substantial reduction in the
bit-rate while maintaining the same image quality. These improvements
make it most suitable for very low bit-rate communication (but do not
exclude it from being used in high bit-rate compression as well).

The H.263 bit-stream syntax defines the structure of the coded data
from the basic block data to the entire image. The quality of the
reconstructed video sequence can be controlled by changing the
quantization step in the encoding process according to a pre-defined
rate control mechanism. This allows a flexibility in generating a
video sequence according to a desired bit-rate or image quality.

When a system is designed to transmit video content to a wide range of
communication channels (video servers or video-on-demand), it is
highly desired to be able to use one compressed video sequence which
can accommodate all needs as opposed to keeping several versions of
the compressed video sequence each compressed to a different bit-rate
according to the target communication channel.

OBJECTS OF THE INVENTION

Therefore, the main object of the invention is to provide a device
incorporated in a video encoder for accommodating a wide range of
communication channels without storing several video bitstreams.

Another object of the invention is to provide a device used in a video
encoder for supplying different quality level video bitstreams by
using the same stored video sequence with different quantization step
sizes.

BRIEF SUMMARY OF INVENTION

Accordingly, the device according to the invention is used in a video
encoder comprising motion estimation means providing a predicted block
for each predefined block based upon estimating the motion between the
predefined block of the current image and the corresponding block in
the previous image, transform means for transforming a prediction
error resulting from the difference between the predicted block and
the predefined block into the frequency domain, and quantizing means
for quantizing the coefficients of the prediction error in the
frequency domain and providing the quantized coefficients to a video
multiplex coding unit, wherein the quantized coefficients are
de-quantized and inverse transformed to give back the prediction error
and add it to the predicted block whereby the result is provided to
the motion estimation means in order to get a new current predicted
block. Such a device which generates, from one video sequence,
multiple video bitstreams of different bit-rates and corresponding to
different quality levels comprises a number n of stages corresponding
each to a quality level i=1 to n, each stage comprising computing
means for reducing the prediction error in accordance with the quality
level i and a corresponding quantized prediction error residual QD_i,
and summing means for obtaining a cumulative prediction error:

$$QD_{TOTAL} = \sum_{j=0}^{i} \overline{QD}_j$$

corresponding to quality level i, wherein $\overline{QD}_j$ is the
dequantized value of the quantized prediction error residual QD_j.

The invention further provides a way to efficiently store up to four
compressed video sequences corresponding to up to four different
quality level compression of the same video sequence.

The major advantage of the invention is to save storage space or
transmission bits when this data is respectively stored as a data file
or sent on a transmission line because the motion vector information
are only stored once for the four different compressed video
sequences. In addition, only one file (for storage) or stream (for
transmission) is used as opposed to the normal case where each
compressed video sequence is handled independently. One other
advantage of the solution is that video decoders able to process H.263
standard bitstreams but not modified according to the new storing of
data of the invention, will be however able, with no change, to
reconstruct the base layer of the compressed data while discarding the
rest of the sub-bitstreams it cannot decode.

BRIEF DESCRIPTION OF THE DRAWINGS

The objects, features and other characteristics of the invention will
become more apparent from the following detailed description with
reference to the accompanying drawings in which:

FIG. 1 represents a block-diagram of a H.263 encoder of the prior art
wherein a device according to the invention is used;

FIG. 2 is a schematic block-diagram illustrating the way of generating
the quantized prediction error residuals corresponding to increased
quality levels in the device according to the invention; and

FIG. 3 is a schematic block-diagram representing the summing means of
the device according to the invention for getting a cumulative
prediction error.

FIG. 4 shows the bitstream data structure according to the H.263 May
1996 draft being the support for storing four quality levels of the
same video sequence according to the invention.
DETAILED DESCRIPTION OF THE
INVENTION 

The H.263 encoder in which the device according to the
invention is used is based on the hybrid-DPCM scheme
which is used in most of the standard video coders today.
When the encoding function has to be implemented, Coding
Control Unit 10 controls switching circuits 12 and 14 as
illustrated in FIG. 1, which shows a video encoder of the
prior art identical to FIG. 3 of the Draft ITU-Recommendation
H.263, May 1996.

The encoder is composed of a transform coder 16 which
transforms a prediction error, found by subtracting in
subtractor 18 the current macroblock received as an input from
a predicted corresponding macroblock, into the frequency
domain by using the Discrete Cosine Transform (DCT), in
which the information is represented in a compact way
suitable for compression. The DCT coefficients are then
quantized in quantizer 20 with many fewer bits. This
quantization, which provides a quantizing index q for
transform coefficients, introduces the lossy aspect of the video
encoder.

The predicted macroblock used to determine the prediction
error is provided by a motion estimation unit 22 which
provides motion vectors v pointing to the chosen macroblock
in the previous image.

The prediction error and the motion vectors v form the
information needed for the reconstruction process in the
decoder. Indeed, the prediction of the current macroblock is
performed with respect to the previous reconstructed image,
in the same way as is done in the decoder, to avoid any
mismatch. To achieve this, a complete decoder is actually
implemented in the encoder loop. All the information sent by
the encoder to the decoder is coded using Huffman coding,
which represents the bits in a compact and efficient way. The
reconstruction is performed by taking the quantized and
transformed prediction error and performing inverse
quantization in inverse quantizer 24 and inverse DCT
(IDCT) in inverse Transform Coder 26. Then, the macroblock
predicted from the previous reconstructed image is
added to the prediction error in adder 28 to form the current
reconstructed block provided to motion estimation unit 22.
Note that this mechanism is applied to each macroblock
of the image.

The control information from Coding Control Unit 10, the
quantizing index q for transform coefficients and the motion
vectors v are then provided to the Video Multiplex Coding
Unit 30.

In the encoder of FIG. 1, the quantization performed by
quantizer 20 is a process in which the DCT coefficients of
the prediction error PE, which can have many values in a
specific range, are converted into other values that are chosen
from a much smaller subset in that range. The number of
possible values is determined by the quantization step size.
This parameter determines the number of levels in the range
and hence the number of possible values that the coefficients
can have after quantization.

The inverse quantization performed by inverse quantizer
24 is the inverse process in which the coefficients are
transformed back to the original domain with values from
that domain. The reconstructed value can be different from
the original due to the quantization effect. The quantization
error is computed by subtracting the reconstructed
coefficients from the original coefficient values, the difference
representing the error introduced by the quantization
process. Note that, to be able to get back to the original domain
of values, the inverse quantization step size must be identical
to the quantization step size.
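
The quantization, inverse quantization and quantization error described above can be illustrated with a minimal Python sketch. It assumes a plain uniform quantizer that divides by the step size and rounds to the nearest integer; the function names and the rounding rule are illustrative assumptions, and the actual H.263 quantizer differs in detail.

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer level (assumed uniform quantizer)."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization: must use the same step size as quantize()."""
    return [q * step for q in levels]


def quantization_error(coeffs, step):
    """Original coefficients minus their reconstructed values."""
    reconstructed = dequantize(quantize(coeffs, step), step)
    return [c - r for c, r in zip(coeffs, reconstructed)]


# Hypothetical DCT coefficients of a prediction error, step size 8.
coeffs = [37, -13, 5, 0]
levels = quantize(coeffs, 8)
recon = dequantize(levels, 8)
error = quantization_error(coeffs, 8)
print(levels, recon, error)
```

With this rounding rule each error component stays within half a step size, which is why a smaller step size gives a smaller error.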
An essential feature of the invention is to reduce the
quantization error by using several quantization stages
wherein the error in each stage is reduced by decreasing the
quantization step size (since the coefficients are divided by
this step size). This is illustrated in FIG. 2.

In FIG. 2, the coefficients of the prediction error PE (after
DCT transformation) are quantized in quantizer 40 with the
quantization step value Q_0. Therefore, QD_0 is a coarse
representation of the DCT coefficients, which would normally
be used in the encoder without the invention, or the
quantized prediction error in the base level.

At each level i of the mechanism according to the
invention, E_i represents the prediction error residual, being
the difference between the prediction error residual E_{i-1} of
level i-1 and the de-quantized value of the quantized
prediction error residual QD_{i-1} obtained from inverse
quantizer IQ_{i-1}. Thus, the de-quantized value of QD_0
obtained from inverse quantizer 42 (having an inverse
quantization step size IQ_0 equal to quantization step size Q_0)
is subtracted from prediction error PE in subtractor 44 to get
prediction error residual E_1. Then, residual E_1 is quantized
in quantizer 46 having a quantization step size Q_1, smaller
than Q_0, in order to obtain QD_1, which is the quantized
prediction error residual in level i=1. The de-quantized value
of QD_1 obtained from inverse quantizer 48 (with an inverse
quantization step size IQ_1 equal to Q_1) is subtracted from
residual E_1 in subtractor 50 to get residual E_2. Residual E_2
is then quantized in quantizer 52 having a quantization step
size Q_2 smaller than Q_1 in order to obtain QD_2. The
de-quantized value of QD_2 obtained at the output of inverse
quantizer 54 (having an inverse quantization step size IQ_2
equal to Q_2) is subtracted from residual E_2 in subtractor 56
to get residual E_3. Finally, residual E_3 is quantized in
quantizer 58 (having a quantization step value Q_3 smaller
than Q_2) to obtain QD_3.
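
The cascade of FIG. 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the patented implementation: the quantizer is a plain uniform one with nearest-integer rounding, the names `multistage_quantize`, `stages` and `final_residual` are hypothetical, and the residual left after the last stage is computed only to show what remains.

```python
def quantize(values, step):
    """Assumed uniform quantizer: divide by the step size and round."""
    return [round(v / step) for v in values]


def dequantize(levels, step):
    """Inverse quantizer IQ_i, using the same step size as Q_i."""
    return [q * step for q in levels]


def multistage_quantize(pe, steps):
    """Return ([QD_0, QD_1, ...], residual after the last stage).

    pe    -- DCT coefficients of the prediction error PE
    steps -- decreasing step sizes [Q_0, Q_1, ...]
    """
    residual = list(pe)  # the first stage quantizes PE itself
    stages = []
    for step in steps:
        qd = quantize(residual, step)          # QD_i
        stages.append(qd)
        deq = dequantize(qd, step)             # de-quantized QD_i
        residual = [e - d for e, d in zip(residual, deq)]  # E_{i+1}
    return stages, residual


# Hypothetical coefficients and step sizes, chosen only for illustration.
pe = [37, -13, 5, 0]
steps = [16, 8, 4, 2]
stages, final_residual = multistage_quantize(pe, steps)
print(stages, final_residual)
```

Each stage only ever sees what the previous stage failed to represent, so the residual magnitudes shrink as the step sizes shrink.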

As can be seen in FIG. 2, at each stage, the quantized
prediction error residual is computed from the prediction
error residual of the previous stage. No information is lost
between stages since, each time, the error between the input
signal and the quantized signal is completely transferred to
the next stage. The lossy nature of the compression
(quantization error) is introduced only at the output of the
final stage, where the smallest quantization step is used, and
so the error is minimal. Note that, since the remaining error
becomes smaller as stages are added, more than four
stages could be used to come closer to the original prediction
error.
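
The statement that no information is lost between stages can be checked by writing the recursion out. The sketch below uses the notation of this description, with the overline for de-quantized values as a notational assumption:

```latex
\begin{aligned}
E_1 &= PE - \overline{QD}_0, \qquad E_{j+1} = E_j - \overline{QD}_j \quad (j \ge 1) \\
\sum_{j=0}^{i} \overline{QD}_j
  &= (PE - E_1) + \sum_{j=1}^{i} \left(E_j - E_{j+1}\right)
   = PE - E_{i+1} \\
PE - \sum_{j=0}^{i} \overline{QD}_j &= E_{i+1}
\end{aligned}
```

The error left after summing levels 0 to i is therefore exactly the next residual E_{i+1}. If each quantizer rounds to the nearest level (an assumption, not stated in the text), every component of E_{i+1} is bounded in magnitude by Q_i/2.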

Since the residual from the previous stage is quantized
using a finer quantizer, the DCT coefficients can be rebuilt
by the accumulative mechanism illustrated in FIG. 3. In such
a mechanism, the de-quantized values of QD_0 to QD_3,
obtained respectively from inverse quantizers IQ_0, IQ_1,
IQ_2 and IQ_3, are summed in summing circuit 60 to get a
value QD_TOTAL according to

$$QD_{TOTAL} = \sum_{j=0}^{i} \overline{QD}_j$$

wherein $\overline{QD}_j$ is the de-quantized value of QD_j and
i can take any value 0, 1, 2, or 3. The value QD_TOTAL is then
inverse transformed in inverse transform coder 26 as usual.

Thus, the rebuilding mechanism can stop at any level i if
the desired bit-rate or picture quality has been achieved
(according to the restrictions of the communication channel
or the decoder at the other end). In the present case,
QD_TOTAL can be obtained by using QD_0 alone, or QD_0+QD_1,
or QD_0+QD_1+QD_2, or QD_0+QD_1+QD_2+QD_3. In this way,
up to four versions of the same compressed video sequence
can be produced from one bitstream. It must be noted that,
QD_0 being the base prediction error and QD_1, QD_2, QD_3
being components of the errors, when these components are
added to QD_0, the result is closer to the original value before
quantization. In other words, the greater i is in the above
formula giving QD_TOTAL, the better the quality of the
video image will be.
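
The accumulation of FIG. 3 can be sketched as follows. The stage values and step sizes are hypothetical numbers chosen for illustration (they are what a uniform nearest-integer quantizer would give for the coefficients shown), and `cumulative_reconstruction` is an assumed name, not one used in the text.

```python
def dequantize(levels, step):
    """Inverse quantizer IQ_j with step size equal to Q_j."""
    return [q * step for q in levels]


def cumulative_reconstruction(stages, steps, level):
    """Sum the de-quantized values of QD_0..QD_level (QD_TOTAL for that level)."""
    total = [0] * len(stages[0])
    for qd, step in zip(stages[: level + 1], steps[: level + 1]):
        total = [t + d for t, d in zip(total, dequantize(qd, step))]
    return total


# Hypothetical prediction-error coefficients and their four quantized stages.
pe = [37, -13, 5, 0]
steps = [16, 8, 4, 2]
stages = [[2, -1, 0, 0], [1, 0, 1, 0], [-1, 1, -1, 0], [0, 0, 0, 0]]

max_errors = []
for level in range(4):
    total = cumulative_reconstruction(stages, steps, level)
    max_errors.append(max(abs(p - t) for p, t in zip(pe, total)))
    print(level, total, max_errors[-1])
```

In this example the largest coefficient error never grows as levels are added, matching the observation that a greater i gives a result closer to the original value.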

In the preferred embodiment of the invention, up to four
different compressed video sequences corresponding to up
to four different quantization step sizes can be stored as a
unique video sequence. This is done by using the existing
bitstream and sub-bitstream structure of a compressed video
sequence as defined by the H.263 standard but applied to
specific data. This data structure can be used to store
efficiently this multi-level video sequence as a data file or to
transfer it on a transmission line. With this embodiment, one
can use the ability of the H.263 coder to handle up to four
sub-bitstreams without the need to change the bit-stream
structure. Such a structure is described in "Video Coding for
Very Low bit-rate Communication", ITU-T Recommendation
H.263, May 1996.

The data structure defined by the H.263 standard is illustrated
in FIG. 4. The picture layer is the upper layer; the other data
structures 'Group Of Blocks layer', 'Macroblock Layer' and
'Block layer' are embedded one in the other for providing all
the video pixel block information for all the blocks of a
picture. One bit (CPM) indicates the usage of sub-bitstreams.
Then, two bits, which are present if the CPM bit
is activated, allow up to four independent sub-bitstreams to
be defined within the total H.263 bitstream. Picture Sub
Bitstream Indicator (PSBI) are two bits which are present
only if CPM is indicated. They indicate that the picture
header and all following information until the next picture or
Group of Blocks (GOB) header belong to the same
sub-bitstream. Group Sub Bitstream Indicator (GSBI) are two
bits which are present only if CPM is activated. They
indicate that the GOB header and all following information
until the next picture or GOB start code belong to the same
sub-bitstream. This mode is provided to transfer up to four
independent bitstreams: Annex C in the H.263 standard draft,
May 1996, states that 'The information in each individual
bitstream is also completely independent from the information
in the other bitstreams'.

The video encoder of the preferred embodiment of the
invention generates the multi-quality level information and
stores up to four quality level encoded video sequences of
the same video sequence as four sub-bitstreams in a unique
H.263 bitstream. For the preferred embodiment, the use of the
CPM bit in the H.263 bitstream means that multiple
quality level information is included as sub-bitstreams. PSBI
qualifies which quality level it is. The GSBI field
indicates to which quality level the following group of blocks
belongs. The Macroblock Layer and Block Layer have the
same use as in the usual H.263 bitstream: they carry the
macroblock and block encoded information.

Note that the motion vector information which accompanies
each macroblock should be stored only in the base
bitstream. This information is used by all the other
sub-bitstreams. Quantization information is stored in the usual
way for each sub-bitstream (if needed, according to the
rate-control mechanism used by the coder). The only
macroblock information included in the sub-bitstreams apart from
the base one is the quantized residual values, so the
overhead of using the sub-bitstream structure and the
proposed scalability mechanism is minimal.
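
The storage rule above (motion vectors only in the base sub-bitstream, residuals in every sub-bitstream) can be illustrated with a small in-memory model. This is a hypothetical Python sketch: the class and function names are assumptions, and it does not reproduce the actual H.263 bit syntax or field encoding.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MacroblockData:
    residual_levels: List[int]                       # quantized residual QD_i
    motion_vector: Optional[Tuple[int, int]] = None  # present in base layer only


@dataclass
class GroupOfBlocks:
    gsbi: int  # sub-bitstream (quality level) index, 0 to 3
    macroblocks: List[MacroblockData] = field(default_factory=list)


def build_gobs(stages_per_mb, motion_vectors):
    """stages_per_mb[m][i] holds QD_i for macroblock m."""
    n_levels = len(stages_per_mb[0])
    if not 1 <= n_levels <= 4:
        raise ValueError("two GSBI bits address at most four sub-bitstreams")
    gobs = []
    for level in range(n_levels):
        gob = GroupOfBlocks(gsbi=level)
        for mb_index, stages in enumerate(stages_per_mb):
            # Motion vectors are stored once, in the base sub-bitstream.
            mv = motion_vectors[mb_index] if level == 0 else None
            gob.macroblocks.append(MacroblockData(stages[level], mv))
        gobs.append(gob)
    return gobs


# Two hypothetical macroblocks, each with four quantized stages.
stages_per_mb = [
    [[2, -1], [1, 0], [-1, 1], [0, 0]],
    [[0, 3], [1, -1], [0, 1], [1, 0]],
]
motion_vectors = [(1, -2), (0, 4)]
gobs = build_gobs(stages_per_mb, motion_vectors)
print([g.gsbi for g in gobs])
```

Only the level-0 entries carry a motion vector, so the enhancement levels add nothing beyond their residual values.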



The decoder of the preferred embodiment receives the 
bitstream as described and is able to use some or all of the 
bitstreams to reconstruct the video sequence at a quality (or 
bit rate) which matches the decoder needs. If, according to 
the bandwidth of the communication channel used, only a 
subset of the bitstream can be sent by the video encoder 
through the communication channel, the video decoder will 
reconstruct the video sequence at a quality which will match 
the channel bandwidth. 
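
How many sub-bitstreams fit a given channel can be sketched as below. The per-layer bit-rates and the function name `select_levels` are hypothetical; the only property taken from the description is that the layers are used in order starting from the base one, since each residual refines the previous levels.

```python
def select_levels(layer_bitrates, channel_bitrate):
    """Return how many sub-bitstreams, base first, fit in the channel."""
    used = 0
    count = 0
    for rate in layer_bitrates:
        if used + rate > channel_bitrate:
            break  # higher levels depend on this one, so stop here
        used += rate
        count += 1
    return count


# Hypothetical bit-rates (bits per second) of the four sub-bitstreams.
layer_bitrates = [20_000, 12_000, 16_000, 24_000]
narrow = select_levels(layer_bitrates, 28_800)
wide = select_levels(layer_bitrates, 56_000)
print(narrow, wide)
```

A result of zero would mean that even the base sub-bitstream does not fit the channel.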

What is claimed is: 

1. In a video encoder comprising motions estimation 
means (22) providing a predicted block for each predefined 
block based upon estimating the motion between said pre- 
defined block of a current image and the corresponding 
block in a previous image, transform means (16) for trans- 
forming a prediction error resulting from the difference 
between said predicted block and said predefined block into 
the frequency domain, and quantizing means (20) for quan- 
tizing coefficients of the prediction error in the frequency 
domain and providing the quantized coefficients to a video 
multiplex coding unit (30), wherein said quantized coeffi- 
cients are de-quantized (24) and inverse transformed (26) to 
give back said prediction error and add it to said predicted 
block whereby the result is provided to said motion estima- 
tion means in order to get a new current predicted block; 

a device comprising means for generating from one video 
sequence, multiple video bitstreams of different bit- 
rates and corresponding to different quality levels, said 
device further comprising: 

means for building a hierarchical H.263 bitstream 
including at least one and up to four substreams each 
corresponding to one different quality level gener- 
ated from said video sequence, a CPM bit of the 
H.263 bitstream being set to 1, a PSBI 2 bit field 
being set to the number of different quality level 
substreams stored, a first Group Of Block compris- 
ing a 2 bit field GSBI being set to the first quality 
level bitstream, the corresponding Macroblock layer 
comprising MVD, MVD2, MVD3, MVD4, MVDB 
bit fields storing motion information generated from 
said video sequence for said first quality level 
bitstream, following Group Of Blocks comprising a 
2 bit field GSBI being set to the value corresponding 
to the range of the following quality level bitstreams 
wherein the corresponding Macroblock layer MVD, 
MVD2, MVD3, MVD4, MVDB bit fields do not 
store motion information. 

2. Device according to claim 1 further comprising: 

a number n of stages, each said stage corresponding to a
quality level i=1 to n, each said stage comprising
computing means for reducing the prediction error in
accordance with said quality level i and providing a
corresponding quantized prediction error residual QD_i,
and

summing means for getting a cumulative prediction error

$$QD_{TOTAL} = \sum_{j=0}^{i} \overline{QD}_j$$

corresponding to quality level i,

wherein $\overline{QD}_j$ is a dequantized value of the quantized
prediction error residual QD_j.



3. Device according to claim 2, wherein said computing 
means in each one of said stages comprise: 
means for determining a prediction error residual E_i
corresponding to quality level i, said residual being the
difference between prediction error residual E_{i-1} of
quality level i-1 and the de-quantized value of the
quantized prediction error residual QD_{i-1} obtained in
the stage corresponding to quality level i-1, and
quantizing means for quantizing said prediction error
residual E_i according to a quantization step size smaller
than the quantization step size of the stage corresponding
to quality level i-1.

4. Device according to claim 3, wherein said quantizing
means in each stage corresponding to quality level i is a
quantizer (40, 46, 52 or 58) having a quantization step size
Q_i smaller than the quantization step size of the stage
corresponding to quality level i-1.

5. Device according to claim 4, wherein in each stage
corresponding to quality level i, the quantized prediction
error residual QD_i is de-quantized in an inverse quantizer
(42, 48, or 54) having a quantization step size IQ_i equal to
the quantizer step size Q_i used in quantizer of said stage.

6. Device according to any one of claims 5, wherein the
input of the first stage corresponding to quality level i=1 is
said prediction error and the output of said stage being
prediction error residual E_i.

7. Device according to claim 6, wherein the de-quantized
values QD_i for i=0 to 3 are respectively obtained by means
of inverse quantizers using the same quantization step sizes
IQ_0, IQ_1, IQ_2 or IQ_3 as the quantization step sizes Q_0 to Q_3
of quantizers providing said quantized prediction error QD_0
and quantized prediction error residuals QD_1, QD_2 or QD_3.

8. In a video encoder comprising motions estimation 
means (22) providing a predicted block for each predefined
block based upon estimating the motion between said pre- 
defined block of the current image and the corresponding 
block in the previous image, transform means (16) for 
transforming a prediction error resulting from the difference 
between said predicted block and said predefined block into
the frequency domain, and quantizing means (20) for quan- 
tizing the coefficients of the prediction error in the frequency 
domain and providing the quantized coefficients to a video 
multiplex coding unit (30), wherein said quantized coeffi- 
cients are de-quantized (24) and inverse transformed (26) to 
give back said prediction error and add it to said predicted 
block whereby the result is provided to said motion estima- 
tion means in order to get a new current predicted block; 
a device comprising means for generating a video 
sequence from reading a bitstream of a compressed 
video sequence of different specific bit-rate and corre- 
sponding to a specific quality level, said device further 
comprising: 

means for reading a hierarchical H.263 bitstream 
including at least one and up to four substreams each 
corresponding to one different bit-rate corresponding 
to different quality level generated from one video 
sequence, a CPM bit of the H.263 bitstream being set 
to 1, a PSBI 2 bit field being set to the number of 
different quality level substreams stored, the first
Group Of Block comprising a 2 bit field GSBI being 
set to X'Or for the first quality level bitstream, the 
corresponding Macroblock layer comprising MVD, 
MVD2, MVD3, MVD4, MVDB bit fields storing 
motion information for said first subitstream, and the 
following Group Of Blocks if any, comprising a 2 bit 
field GSBI being set to the value corresponding to 
the range of the corresponding quality level 
bitstream, the motion information being the content 
of the MVD, MVD2, MVD3, MVD4, MVDB bit 
fields read in the first subitstream. 

***** 